interactive scenario
Bench2Drive: Towards Multi-Ability Benchmarking of Closed-Loop End-To-End Autonomous Driving
In an era marked by the rapid scaling of foundation models, autonomous driving technologies are approaching a transformative threshold, where end-to-end autonomous driving (E2E-AD) is emerging because of its potential to scale up in a data-driven manner. However, existing E2E-AD methods are mostly evaluated in an open-loop, log-replay manner with L2 error and collision rate as metrics (e.g., on nuScenes), which cannot fully reflect the driving performance of algorithms, as recently acknowledged in the community. E2E-AD methods that are evaluated under a closed-loop protocol are tested on fixed routes (e.g., Town05Long and Longest6 in CARLA) with the driving score as the metric, which is known for high variance due to the unsmoothed metric function and the large randomness in long routes. Besides, these methods usually collect their own training data, which makes fair algorithm-level comparison infeasible. To fulfill the paramount need for comprehensive, realistic, and fair testing environments for Full Self-Driving (FSD), we present Bench2Drive, the first benchmark for evaluating E2E-AD systems' multiple abilities in a closed-loop manner. Bench2Drive's official training data consists of 2 million fully annotated frames, collected from 10,000 short clips uniformly distributed over 44 interactive scenarios (cut-in, overtaking, detour, etc.), 23 weather conditions (sunny, foggy, rainy, etc.), and 12 towns (urban, village, university, etc.) in CARLA v2. Its evaluation protocol requires E2E-AD models to pass 44 interactive scenarios under different locations and weather conditions, which sums to 220 routes and thus provides a comprehensive and disentangled assessment of their driving capability under different situations. We implement state-of-the-art E2E-AD models and evaluate them in Bench2Drive, providing insights regarding the current status and future directions.
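The "disentangled assessment" described in the Bench2Drive abstract can be pictured as a per-scenario breakdown of route outcomes rather than a single aggregate score. The sketch below is only an illustration under stated assumptions: the `(scenario, passed)` record format and the function name `success_rate_by_scenario` are hypothetical and are not taken from the Bench2Drive code base.

```python
from collections import defaultdict

def success_rate_by_scenario(results):
    """Group route outcomes by scenario type and return the pass rate of each.

    `results` is an iterable of (scenario_name, passed) pairs; this record
    format is an assumption made for illustration.
    """
    totals = defaultdict(lambda: [0, 0])  # scenario -> [passed, attempted]
    for scenario, passed in results:
        totals[scenario][0] += int(passed)
        totals[scenario][1] += 1
    return {s: p / n for s, (p, n) in totals.items()}

# Hypothetical outcomes for a handful of routes.
routes = [
    ("cut-in", True), ("cut-in", False),
    ("overtaking", True), ("overtaking", True),
    ("detour", False),
]
rates = success_rate_by_scenario(routes)
```

Reporting one rate per scenario makes it visible when a model handles, say, overtaking well but fails on detours, which a single averaged score would hide.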
Test Automation for Interactive Scenarios via Promptable Traffic Simulation
Mondelli, Augusto, Li, Yueshan, Zanardi, Alessandro, Frazzoli, Emilio
Autonomous vehicle (AV) planners must undergo rigorous evaluation before widespread deployment on public roads, particularly to assess their robustness against the uncertainty of human behaviors. While recent advancements in data-driven scenario generation enable the simulation of realistic human behaviors in interactive settings, leveraging these models to construct comprehensive tests for AV planners remains an open challenge. In this work, we introduce an automated method to efficiently generate realistic and safety-critical human behaviors for AV planner evaluation in interactive scenarios. We parameterize complex human behaviors using low-dimensional goal positions, which are then fed into a promptable traffic simulator, ProSim, to guide the behaviors of simulated agents. To automate test generation, we introduce a prompt generation module that explores the goal domain and efficiently identifies safety-critical behaviors using Bayesian optimization. We apply our method to the evaluation of an optimization-based planner and demonstrate its effectiveness and efficiency in automatically generating diverse and realistic driving behaviors across scenarios with varying initial conditions.
- Europe > Switzerland > Zürich > Zürich (0.40)
- North America > United States > Montana (0.04)
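The promptable-simulation abstract above describes searching a low-dimensional goal domain for safety-critical behaviors with Bayesian optimization. A minimal sketch of that idea follows, using a Gaussian-process surrogate and a lower-confidence-bound acquisition over a one-dimensional goal offset. Everything specific here is an assumption for illustration: `min_distance` is a toy stand-in for a simulator rollout (it does not call ProSim), and the kernel, noise level, and `kappa` are arbitrary choices rather than the authors' settings.

```python
import math

def rbf(a, b, length=0.5):
    """Squared-exponential kernel between two scalar goal parameters."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def gp_posterior(xs, ys, x_new, noise=1e-4):
    """Zero-mean GP posterior mean and standard deviation at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new) for a in xs]
    mean = sum(k * a for k, a in zip(k_star, solve(K, ys)))
    var = rbf(x_new, x_new) - sum(k * w for k, w in zip(k_star, solve(K, k_star)))
    return mean, math.sqrt(max(var, 0.0))

def min_distance(goal):
    """Hypothetical stand-in for a rollout: smaller value = more critical."""
    return (goal - 1.3) ** 2 + 0.2

def bayes_opt(objective, candidates, init, n_iter=8, kappa=2.0):
    """Minimize `objective` over a candidate grid with GP-LCB acquisition."""
    xs = list(init)
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        pool = [c for c in candidates if c not in xs]

        def lcb(c):
            mean, std = gp_posterior(xs, ys, c)
            return mean - kappa * std  # optimistic estimate of the minimum

        nxt = min(pool, key=lcb)
        xs.append(nxt)
        ys.append(objective(nxt))
    best = min(range(len(xs)), key=lambda i: ys[i])
    return xs[best], ys[best]

candidates = [i / 10 for i in range(31)]  # goal offsets in [0, 3]
best_goal, best_dist = bayes_opt(min_distance, candidates, init=[0.0, 1.5, 3.0])
```

In a real pipeline each call to the objective would be an expensive simulation, which is why a surrogate model that chooses the next goal to try is attractive compared with exhaustive grid search.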
Real-world Troublemaker: A 5G Cloud-controlled Track Testing Framework for Automated Driving Systems in Safety-critical Interaction Scenarios
Zhang, Xinrui, Xiong, Lu, Zhang, Peizhi, Huang, Junpeng, Ma, Yining
Track testing plays a critical role in the safety evaluation of autonomous driving systems (ADS), as it provides a real-world interaction environment. However, the inflexibility in motion control of object targets and the absence of intelligent interactive testing methods often result in pre-fixed and limited testing scenarios. To address these limitations, we propose a novel 5G cloud-controlled track testing framework, Real-world Troublemaker. This framework overcomes the rigidity of traditional pre-programmed control by leveraging 5G cloud-controlled object targets integrated with the Internet of Things (IoT) and vehicle teleoperation technologies. Unlike conventional testing methods that rely on pre-set conditions, we propose a dynamic game strategy based on a quadratic risk interaction utility function, facilitating intelligent interactions with the vehicle under test (VUT) and creating a more realistic and dynamic interaction environment. The proposed framework has been successfully implemented at the Tongji University Intelligent Connected Vehicle Evaluation Base. Field test results demonstrate that Troublemaker can perform dynamic interactive testing of ADS accurately and effectively. Compared to traditional methods, Troublemaker improves scenario reproduction accuracy by 65.2%, increases the diversity of interaction strategies by approximately 9.2 times, and enhances exposure frequency of safety-critical scenarios by 3.5 times in unprotected left-turn scenarios.
Index Terms: Automated driving systems, track testing, 5G, cloud-controlled object targets, interaction scenarios.
The safety of automated driving systems (ADS) must be ensured prior to their practical implementation, which requires a well-established testing framework [1]. Existing test standards, such as ISO 26262 [2], UN R157 [3], and UN R171 [4], are not sufficient to comprehensively evaluate ADS.
According to the driving automation levels defined by SAE J3016 from SAE International, a high-level ADS (i.e., Level 3 or higher) is expected to perform driving tasks independently and autonomously, with the driver no longer retaining continuous control over vehicle movement [5]. While ADS has already been deployed in various countries and regions, numerous ADS traffic incidents highlight that safety testing for high-level ADS remains a critical technical challenge. In comparison to traditional vehicles and advanced driver assistance systems (ADAS), high-level ADS testing faces significant transformations and challenges, particularly in terms of both test subjects and requirements.
- Europe > Switzerland > Geneva > Geneva (0.04)
- Europe > Norway > Eastern Norway > Innlandet > Hamar (0.04)
- Transportation > Ground > Road (1.00)
- Information Technology (1.00)
- Automobiles & Trucks (1.00)
Surprise Potential as a Measure of Interactivity in Driving Scenarios
Ding, Wenhao, Veer, Sushant, Leung, Karen, Cao, Yulong, Pavone, Marco
Validating the safety and performance of an autonomous vehicle (AV) requires benchmarking on real-world driving logs. However, typical driving logs contain mostly uneventful scenarios with minimal interactions between road users. Identifying interactive scenarios in real-world driving logs enables the curation of datasets that amplify critical signals and provide a more accurate assessment of an AV's performance. In this paper, we present a novel metric that identifies interactive scenarios by measuring an AV's surprise potential on others. First, we identify three dimensions of the design space to describe a family of surprise potential measures. Second, we exhaustively evaluate and compare different instantiations of the surprise potential measure within this design space on the nuScenes dataset. To determine how well a surprise potential measure correctly identifies an interactive scenario, we use a reward model learned from human preferences to assess alignment with human intuition. Our proposed surprise potential, arising from this exhaustive comparative study, achieves a correlation of more than 0.82 with the human-aligned reward function, outperforming existing approaches. Lastly, we validate motion planners on curated interactive scenarios to demonstrate downstream applications.
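The surprise-potential abstract above does not spell out a formula, so the following is only one hypothetical way to picture "surprise": the KL divergence between another road user's predicted maneuver distribution after and before the AV acts. The function name, the three-maneuver discretization, and all probabilities are assumptions made for this sketch, not the authors' measure.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

# Hypothetical belief over another vehicle's maneuver: [keep lane, slow, brake hard].
prior = [0.7, 0.2, 0.1]                # before the AV acts
after_gentle_merge = [0.65, 0.25, 0.10]
after_sharp_cut_in = [0.1, 0.2, 0.7]

surprise_gentle = kl_divergence(after_gentle_merge, prior)
surprise_cut_in = kl_divergence(after_sharp_cut_in, prior)
```

Under this toy definition the sharp cut-in scores far higher than the gentle merge, which matches the intuition that interactive scenarios are those where one agent's action substantially changes what others are expected to do.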
AgentSense: Benchmarking Social Intelligence of Language Agents through Interactive Scenarios
Mou, Xinyi, Liang, Jingcong, Lin, Jiayu, Zhang, Xinnong, Liu, Xiawei, Yang, Shiyue, Ye, Rong, Chen, Lei, Kuang, Haoyu, Huang, Xuanjing, Wei, Zhongyu
Large language models (LLMs) are increasingly leveraged to empower autonomous agents to simulate human beings in various fields of behavioral research. However, evaluating their capacity to navigate complex social interactions remains a challenge. Previous studies face limitations due to insufficient scenario diversity, complexity, and a single-perspective focus. To this end, we introduce AgentSense: Benchmarking Social Intelligence of Language Agents through Interactive Scenarios. Drawing on Dramaturgical Theory, AgentSense employs a bottom-up approach to create 1,225 diverse social scenarios constructed from extensive scripts. We evaluate LLM-driven agents through multi-turn interactions, emphasizing both goal completion and implicit reasoning. We analyze goals using ERG theory and conduct comprehensive experiments. Our findings highlight that LLMs struggle with goals in complex social scenarios, especially high-level growth needs, and even GPT-4o requires improvement in private information reasoning. Code and data are available at https://github.com/ljcleo/agent_sense.
Imagined Potential Games: A Framework for Simulating, Learning and Evaluating Interactive Behaviors
Sun, Lingfeng, Wang, Yixiao, Hung, Pin-Yun, Wang, Changhao, Zhang, Xiang, Xu, Zhuo, Tomizuka, Masayoshi
Interacting with human agents in complex scenarios presents a significant challenge for robotic navigation, particularly in environments that necessitate both collision avoidance and collaborative interaction, such as indoor spaces. Unlike static or predictably moving obstacles, human behavior is inherently complex and unpredictable, stemming from dynamic interactions with other agents. Existing simulation tools frequently fail to adequately model such reactive and collaborative behaviors, impeding the development and evaluation of robust social navigation strategies. This paper introduces a novel framework utilizing distributed potential games to simulate human-like interactions in highly interactive scenarios. Within this framework, each agent imagines a virtual cooperative game with others based on its estimation. We demonstrate this formulation can facilitate the generation of diverse and realistic interaction patterns in a configurable manner across various scenarios. Additionally, we have developed a gym-like environment leveraging our interactive agent model to facilitate the learning and evaluation of interactive navigation algorithms.
- North America > United States > California > Santa Clara County > Mountain View (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
- Asia (0.04)
- Transportation (0.68)
- Leisure & Entertainment > Games (0.67)
Adaptive Decision-Making for Autonomous Vehicles: A Learning-Enhanced Game-Theoretic Approach in Interactive Environments
Huang, Heye, Liu, Jinxin, Shi, Guanya, Zhao, Shiyue, Li, Boqi, Wang, Jianqiang
This paper proposes an adaptive behavioral decision-making method for autonomous vehicles (AVs) focusing on complex merging scenarios. Leveraging principles from non-cooperative game theory, we develop a vehicle interaction behavior model that defines key traffic elements and integrates a multifactorial reward function. Maximum entropy inverse reinforcement learning (IRL) is employed to optimize the behavior model parameters. Optimal matching parameters can be obtained using the interaction behavior feature vector and the behavior probabilities output by the vehicle interaction model. Further, a behavioral decision-making method adapted to dynamic environments is proposed. By establishing a mapping model between multiple environmental variables and model parameters, it enables online learning and recognition of the parameters and outputs the interactive behavior probabilities of AVs. Quantitative analysis employing naturalistic driving datasets (highD and exiD) and real-vehicle test data validates the model's high consistency with human decision-making. In 188 tested interaction scenarios, the average human-like similarity rate is 81.73%, with a notable 83.12% in the highD dataset. Furthermore, in 145 dynamic interactions, the method matches human decisions at 77.12%, with 6913 consistent instances. Moreover, in real-vehicle tests, a 72.73% similarity with 0% safety violations is obtained. Results demonstrate the effectiveness of our proposed method in enabling AVs to make informed adaptive behavior decisions in interactive environments.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- (15 more...)
- Transportation > Ground > Road (1.00)
- Automobiles & Trucks > Manufacturer (0.67)
- Education (0.67)
DiPA: Probabilistic Multi-Modal Interactive Prediction for Autonomous Driving
Knittel, Anthony, Hawasly, Majd, Albrecht, Stefano V., Redford, John, Ramamoorthy, Subramanian
Accurate prediction is important for operating an autonomous vehicle in interactive scenarios. Prediction must be fast, to support multiple requests from a planner exploring a range of possible futures. The generated predictions must accurately represent the probabilities of predicted trajectories, while also capturing different modes of behaviour (such as turning left vs continuing straight at a junction). To this end, we present DiPA, an interactive predictor that addresses these challenging requirements. Previous interactive prediction methods use an encoding of k-mode-samples, which under-represents the full distribution. Other methods optimise closest-mode evaluations, which test whether one of the predictions is similar to the ground-truth, but allow additional unlikely predictions to occur, over-representing unlikely predictions. DiPA addresses these limitations by using a Gaussian-Mixture-Model to encode the full distribution, and optimising predictions using both probabilistic and closest-mode measures. These objectives respectively optimise probabilistic accuracy and the ability to capture distinct behaviours, and there is a challenging trade-off between them. We are able to solve both together using a novel training regime. DiPA achieves new state-of-the-art performance on the INTERACTION and NGSIM datasets, and improves over the baseline (MFP) when both closest-mode and probabilistic evaluations are used. This demonstrates effective prediction for supporting a planner on interactive scenarios.
- Transportation > Ground > Road (1.00)
- Automobiles & Trucks (0.83)
- Information Technology (0.64)
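The DiPA abstract above contrasts closest-mode evaluation with probabilistic evaluation. The sketch below illustrates why the two can disagree, using minimum average displacement error (minADE) as a closest-mode measure and the negative log-likelihood under a Gaussian mixture as a probabilistic one. The isotropic per-timestep Gaussian with fixed `sigma`, the toy trajectories, and the function names are assumptions for illustration and are not DiPA's actual model or metrics code.

```python
import math

def ade(traj, gt):
    """Average displacement error between a predicted and a true trajectory."""
    return sum(math.dist(p, g) for p, g in zip(traj, gt)) / len(gt)

def min_ade(modes, gt):
    """Closest-mode measure: error of the best mode, ignoring mode weights."""
    return min(ade(m, gt) for m in modes)

def mixture_nll(modes, weights, gt, sigma=1.0):
    """Negative log-likelihood of gt under a mixture of isotropic 2D Gaussians.

    Each mode is treated as the mean of independent per-timestep Gaussians
    with standard deviation sigma (an assumption made for this sketch).
    """
    log_terms = []
    for mode, w in zip(modes, weights):
        ll = math.log(w)
        for (px, py), (gx, gy) in zip(mode, gt):
            d2 = (px - gx) ** 2 + (py - gy) ** 2
            ll += -d2 / (2 * sigma ** 2) - math.log(2 * math.pi * sigma ** 2)
        log_terms.append(ll)
    mx = max(log_terms)  # log-sum-exp for numerical stability
    return -(mx + math.log(sum(math.exp(t - mx) for t in log_terms)))

ground_truth = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]   # vehicle continues straight
straight = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
left_turn = [(1.0, 0.5), (2.0, 1.5), (3.0, 3.0)]
modes = [straight, left_turn]

# Two hypothetical predictors with identical modes but different mode weights.
nll_confident = mixture_nll(modes, [0.9, 0.1], ground_truth)
nll_misweighted = mixture_nll(modes, [0.1, 0.9], ground_truth)
closest_mode_error = min_ade(modes, ground_truth)
```

Both predictors obtain the same minADE because the straight mode is present in each, yet the one that assigns most probability to the left turn has a much worse likelihood. This is the kind of over-represented unlikely prediction that a closest-mode measure alone does not penalize.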
Continual Interactive Behavior Learning With Traffic Divergence Measurement: A Dynamic Gradient Scenario Memory Approach
Lin, Yunlong, Li, Zirui, Gong, Cheng, Lu, Chao, Wang, Xinwei, Gong, Jianwei
Developing autonomous vehicles (AVs) helps improve the road safety and traffic efficiency of intelligent transportation systems (ITS). Accurately predicting the trajectories of traffic participants is essential to the decision-making and motion planning of AVs in interactive scenarios. Recently, learning-based trajectory predictors have shown state-of-the-art performance in highway or urban areas. However, most existing learning-based models trained with fixed datasets may perform poorly in continuously changing scenarios. Specifically, they may not perform well in previously learned scenarios after learning a new one. This phenomenon is called "catastrophic forgetting". Few studies investigate trajectory prediction in continuous scenarios, where catastrophic forgetting may happen. To handle this problem, first, a novel continual learning (CL) approach for vehicle trajectory prediction is proposed in this paper. Then, inspired by brain science, a dynamic memory mechanism is developed by utilizing the measurement of traffic divergence between scenarios, which balances the performance and training efficiency of the proposed CL approach. Finally, datasets collected from different locations are used to design continual training and testing methods in experiments. Experimental results show that the proposed approach achieves consistently high prediction accuracy in continuous scenarios without re-training, which mitigates catastrophic forgetting compared to non-CL approaches. The implementation of the proposed approach is publicly available at https://github.com/BIT-Jack/D-GSM
- Asia > China > Beijing > Beijing (0.05)
- Europe > Netherlands > South Holland > Delft (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (3 more...)
- Transportation > Ground > Road (1.00)
- Health & Medicine (1.00)